-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(backend,prepro): use etag to reduce database load #2768
Conversation
cda70e5
to
031a4bb
Compare
147be35
to
7016c49
Compare
a870345
to
ef6ca41
Compare
7016c49
to
b26c920
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Only two places I am a bit unsure about.
Also does this mean that the database query in fetchUnprocessedEntriesAndUpdateToInProcessing
was just really inefficient? Because the main change that this PR introduces is that if there has not been a change in the database we do not perform this complicated select entries in sequence_entries where ... not in preprocessing
and do not stream an empty response.
backend/src/main/kotlin/org/loculus/backend/controller/SubmissionController.kt
Show resolved
Hide resolved
0ad25fb
to
e2e4d7b
Compare
backend/src/main/kotlin/org/loculus/backend/model/ReleasedDataModel.kt
Outdated
Show resolved
Hide resolved
backend/src/main/kotlin/org/loculus/backend/controller/SubmissionController.kt
Outdated
Show resolved
Hide resolved
d968edc
to
009041e
Compare
co-authored with @anna-parker and Claude Sonnet try now it might work? Remove views fixup Change to a conditional update for useNewerProcessingPipelineIfPossible, add last_updated_db as a view. (#2741) Co-authored-by: Anna (Anya) Parker <50943381+anna-parker@users.noreply.github.com> Still return current newest version Return version from correct place Don't issue delete statement unless there's something to delete fix ktlint Adjust useNewerProcessingPipelineIfPossible to only update table when there are changes Table update tracker anya (#2764) * Add option to get-released-data to check last update time. * make name better * Make or short-circuiting * Only work with table_update_tracker - return lastTime in LAST_MODIFIED header * Remove unused functions * add back missing imports - somehow format removed this??? * Update bash scripts * Attempt to fix bash * 2nd little fix * echo value of last_modified not name * Do not update SILO if loculus responds with 304. * Always echo $lastSnapshot unless new data existed in loculus AND this was successfully sent to LAPIS - then echo $last_modified * save last snapshot to file * Update with loculus data anyways if longer than 1hour since last update * fix * Improve silo import scripts etc * add back BACKEND_BASE_URL * Log headers * Regularize * Better logs * Try to fix error * return 0 and never null in header * Fix read non-existent file bug * Better wrapper logging * fix delete after update snapshot * Make grep case insensitive * add back removed exit 0 * Improvement to logging * Add test * Add full process to test * Update documentation --------- Co-authored-by: Cornelius Roemer <cornelius.roemer@gmail.com> CR edits to table-update-tracker (#2767) * Add option to get-released-data to check last update time. * make name better * Make or short-circuiting * Only work with table_update_tracker - return lastTime in LAST_MODIFIED header * Remove unused functions * add back missing imports - somehow format removed this??? * Update bash scripts * Attempt to fix bash * 2nd little fix * echo value of last_modified not name * Do not update SILO if loculus responds with 304. * Always echo $lastSnapshot unless new data existed in loculus AND this was successfully sent to LAPIS - then echo $last_modified * save last snapshot to file * Update with loculus data anyways if longer than 1hour since last update * fix * Improve silo import scripts etc * add back BACKEND_BASE_URL * Log headers * Regularize * Better logs * Try to fix error * return 0 and never null in header * Fix read non-existent file bug * Better wrapper logging * fix delete after update snapshot * Make grep case insensitive * add back removed exit 0 * Improvement to logging * Add test * Add full process to test * Update documentation * Use etag instead of modified-since for spec compliance * ktlin fix * Move logic into model * Fix etag formatting * Fix etag yet again --------- Co-authored-by: Anna (Anya) Parker <50943381+anna-parker@users.noreply.github.com> refactor(backend): streamline control flow refactor(backend): extract function Update backend/src/main/kotlin/org/loculus/backend/controller/SubmissionController.kt Co-authored-by: Fabian Engelniederhammer <92720311+fengelniederhammer@users.noreply.github.com> Update backend/src/main/kotlin/org/loculus/backend/controller/SubmissionController.kt Co-authored-by: Fabian Engelniederhammer <92720311+fengelniederhammer@users.noreply.github.com> Update backend/src/test/kotlin/org/loculus/backend/controller/submission/GetReleasedDataEndpointTest.kt Co-authored-by: Fabian Engelniederhammer <92720311+fengelniederhammer@users.noreply.github.com> Enable etag for /extract-unprocessed-data Use etag in prepro to skip preprocessing match case refactor Try fix fix logic fix! i'm stupid Force refresh every hour Add response code annotations Update preprocessing/nextclade/src/loculus_preprocessing/prepro.py Co-authored-by: Anna (Anya) Parker <50943381+anna-parker@users.noreply.github.com> Add etag to dummy preprocessing # Conflicts: # backend/src/main/kotlin/org/loculus/backend/controller/SubmissionController.kt # backend/src/main/kotlin/org/loculus/backend/controller/SubmissionControllerDescriptions.kt # backend/src/main/kotlin/org/loculus/backend/model/ReleasedDataModel.kt # backend/src/main/kotlin/org/loculus/backend/service/submission/SubmissionDatabaseService.kt # backend/src/main/kotlin/org/loculus/backend/service/submission/UpdateTrackerTable.kt # backend/src/main/resources/db/migration/V1.2__add_table_update_tracker.sql # backend/src/test/kotlin/org/loculus/backend/controller/submission/GetReleasedDataEndpointTest.kt # kubernetes/loculus/silo_import_job.sh
2d5e0de
to
f153c39
Compare
relates to #2092
preview URL: https://etag-prepro.loculus.org
Summary
Send etag to /extract-unprocessed-sequences to reduce db/backend load
Screenshot
Big reduction in db and backend load, approximately to 1/4 to 1/3 of the previous base load
PR Checklist